Learning-Aided Control in Stochastic Queueing Systems
Abstract
In this paper, we investigate the power of online learning in stochastic network optimization when the system statistics are unknown a priori. We are interested in understanding how information and learning can be efficiently incorporated into system control techniques, and what the fundamental benefits of doing so are. We propose two Online Learning-Aided Control techniques, OLAC and OLAC2, that explicitly utilize past system information in current system control via a learning procedure called dual learning. We prove strong performance guarantees for the proposed algorithms: OLAC and OLAC2 achieve the near-optimal $[O(\epsilon), O([\log(1/\epsilon)]^2)]$ utility-delay tradeoff, and OLAC2 possesses an $O(\epsilon^{-2/3})$ convergence time. OLAC and OLAC2 are probably the first algorithms that simultaneously possess an explicit near-optimal delay guarantee and sub-linear convergence time. Simulation results also confirm the superior performance of the proposed algorithms in practice. To the best of our knowledge, our attempt is the first to explicitly incorporate online learning into stochastic network optimization and to demonstrate its power in both theory and practice.
Similar resources
The achievable region method in the optimal control of queueing systems; formulations, bounds and policies
We survey a new approach that the author and his co-workers have developed to formulate stochastic control problems (predominantly queueing systems) as mathematical programming problems. The central idea is to characterize the region of achievable performance in a stochastic control problem, i.e., find linear or nonlinear constraints on the performance vectors that all policies satisfy. We pres...
Multi-Objective Lead Time Control in Multistage Assembly Systems (TECHNICAL NOTE)
In this paper we develop a multi-objective model to optimally control the lead time of a multistage assembly system. The multistage assembly system is modeled as an open queueing network, whose service stations represent manufacturing or assembly operations. The arrival processes of the individual parts of the product, which should be assembled to each other in assembly stations, are assumed to...
Algorithmic challenges in the theory of queueing networks
The theory of queueing systems is traditionally considered a branch of applied probability. Hence the toolkit used in the analysis of queueing systems draws heavily on the theory of stochastic processes. Many problems in the area of queueing networks, however, are of algorithmic nature, and thus require algorithmic/complexity theoretic approaches. In this tutorial we will discuss several such...
Reinforcement Learning Methods for Continuous-Time Markov Decision Problems
Semi-Markov Decision Problems are continuous time generalizations of discrete time Markov Decision Problems. A number of reinforcement learning algorithms have been developed recently for the solution of Markov Decision Problems, based on the ideas of asynchronous dynamic programming and stochastic approximation. Among these are TD(λ), Q-learning, and Real-time Dynamic Programming. After revie...
The Achievable Region Approach to the Optimal Control of Stochastic Systems
The achievable region approach seeks solutions to stochastic optimisation problems by: (i) characterising the space of all possible performances (the achievable region) of the system of interest, and (ii) optimising the overall system-wide performance objective over this space. This is radically different from conventional formulations based on dynamic programming. The approach is explained with...